Frequency Domain Coding of Speech

Author

  • RONALD E. CROCHIERE
Abstract

Frequency domain techniques for speech coding have recently received considerable attention. The basic concept of these methods is to divide the speech into frequency components by a filter bank (sub-band coding), or by a suitable transform (transform coding), and then encode them using adaptive PCM. Three basic factors are involved in the design of these coders: 1) the type of the filter bank or transform, 2) the choice of bit allocation and the noise shaping properties involved in bit allocation, and 3) the control of the step-size of the encoders. This paper reviews the basic design aspects of these three factors for sub-band and transform coders. Concepts of short-time analysis/synthesis are first discussed and used to establish a basic theoretical framework. It is then shown how practical realizations of sub-band and transform coding are interpreted within this framework. Principles of spectral estimation and models of speech production and perception are then discussed and used to illustrate how the “side information” can be most efficiently represented and utilized in the design of the coder (particularly the adaptive transform coder) to control the dynamic bit allocation and quantizer step-sizes. Recent developments and examples of the “vocoder-driven” adaptive transform coder for low bit-rate applications are then presented.
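
As a rough illustration of the second design factor (bit allocation), the sketch below distributes a fixed bit budget across frequency bands. It is not taken from the paper: it uses a common greedy rule and assumes that each extra bit lowers a band's quantization noise power by a factor of four (about 6 dB). The function name `allocate_bits` and the example variances are hypothetical.

```python
import heapq


def allocate_bits(band_variances, total_bits):
    """Greedy integer bit allocation across frequency bands (illustrative only).

    Assumption: every additional bit divides a band's quantization noise
    power by 4, so the next bit always goes to the band whose estimated
    noise is currently the largest.
    """
    bits = [0] * len(band_variances)
    # heapq is a min-heap, so store negated noise to pop the largest first.
    heap = [(-variance, k) for k, variance in enumerate(band_variances)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        neg_noise, k = heapq.heappop(heap)
        bits[k] += 1
        heapq.heappush(heap, (neg_noise / 4.0, k))
    return bits


# Hypothetical band variances, strongest band first.
example_bits = allocate_bits([16.0, 4.0, 1.0, 0.25], total_bits=8)
print(example_bits)
```

Bands with more signal energy receive more bits, so the estimated noise ends up roughly equal across bands. A perceptually motivated coder would weight the noise estimates before comparing them, which is where the noise shaping mentioned in the abstract would enter.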

Related articles

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

Single-Carrier Frequency-Domain Equalization for Orthogonal STBC over Frequency-Selective MIMO-PLC channels

In this paper we propose a new space diversity scheme for broadband PLC systems using orthogonal space-time block coding (OSTBC) transmission combined with single-carrier frequency-domain equalization (SC-FDE). To apply this diversity technique to PLC channels, we first propose a new technique for combining SC-FDE with OSTBCs applicable to all dispersive multipath channels impaired by impulsive...

Wide-Band Audio Coding Based on Frequency-Domain Linear Prediction

In this paper, we re-visit an original concept of speech coding in which the signal is separated into the carrier modulated by the signal envelope. A recently developed technique, called frequency domain linear prediction (FDLP), is applied for the efficient estimation of the envelope. The processing in the temporal domain allows for a straightforward emulation of the forward temporal masking. ...

Enhanced harmonic coding of speech with frequency domain transition modelling

A major source of audible distortion in current low-bit-rate harmonic speech coding algorithms is the ineffective modeling of transitional speech signals such as onsets, plosives, etc. A new method of modeling transitional speech based on a frequency domain approach is introduced in this paper. The approach uses a modified harmonic model able to produce non-periodic pulse sequences in conju...

Combined harmonic and waveform coding of speech at low bit rates

In this paper we present a new approach for speech coding, which combines frequency-domain harmonic coding for periodic and "noise-like" unvoiced segments of speech with a time-domain waveform coder for transition signals. This hybrid coder requires special handling of the boundary between voiced and transition segments. We outline the details of a 4 kbps hybrid coder and present subjective qua...

Integrated speech enhancement and coding in the time-frequency domain

This paper addresses the problem of merging speech enhancement and coding in the context of auditory modeling. The noisy signal is first processed by a fast wavelet packet transform algorithm to obtain an auditory spectrum, from which a rough masking model is estimated. Then, this model is used to refine a subtractive-type enhancement algorithm. The enhanced speech coefficients are then encoded i...

Publication date: 2002